ABSTRACT


The results of a number of pattern recognition experiments
designed to evaluate the performance of the CONFLEX I system
in classifying sonar contacts are reported. CONFLEX I is a
laboratory experimental pattern-recognition system developed
under Air Force contract. Theoretical measures of system
performance are reviewed as background for the experiment evaluations.


Utilizing a series of specially formatted photographic
transparencies representing 132 submarine and 73 nonsubmarine
returns, three major experiments were run on the CONFLEX I
system: a closed-ended experiment to test the system's ability
to separate two classes; an open-ended experiment to measure
the system's ability to categorize inputs which were not used
in training; and a test of the system's performance when subdividing
the submarine returns into aspect groups.


Classifications  in  these  experiments  were  94.1%,  85.9%  and 
97.6%  correct,  respectively.  Distribution  plots,  "ROC"  curves, 
and  computed  probabilities  of  correct  response  are  included  in 
the  experiment  evaluations. 


Signal preprocessing and slide preparation were greatly
aided by the computing facilities and personnel of the Applied
Mathematics Laboratory of the David Taylor Model Basin.
CONFIDENTIAL

INTRODUCTION 


Acoustic signals received by active sonar equipment have
for years been a major source of underwater threat information
for the human classifier. A number of interesting techniques
for acquiring and processing target information have been
investigated, but classification concepts utilizing information
derived from the sonar return continue to offer the greatest
promise.

Automatic, rapid, and accurate classification of sonar
contacts in the underwater environment is an urgent ASW
requirement and is the focus of current research. An extensive effort
is being made to develop a data processor in which the advanced
capabilities (e.g., greater source levels and mode and signal
flexibility) of the newer sonar equipment are exploited.
Satisfactory achievement of this objective has proven
unusually difficult.

This final report presents the results of classification
experiments which, although necessarily limited, were designed
to evaluate the capability of a pattern-recognition system,
CONFLEX I, to distinguish submarine from nonsubmarine sonar
returns. CONFLEX I is a unique pattern-recognition system
that implements a concept referred to as "conditioned-reflex."
Such a system is "conditioned" by allowing the processor to
derive and store in its memory a reference function for a
given set of multivariable input patterns or stimuli. The
"response" to an unknown input, which is similarly processed,
is to associate the input with the set whose reference function
shows the greatest cross-correlation. The concept has proven
exceptionally effective in various pattern-recognition problems,
many of which are similar to those encountered in sonar return
classification.

The experiments performed under this study contract could
not, within the limitations of the contract, be expected to be
conclusive. Nevertheless, we were encouraged by the results
of those experiments which we were able to perform. Section I
reviews important aspects of the conditioned-reflex concept.
Much of this material has been presented previously in
documents generated by SCOPE but is included in this report as
reference for the system performance evaluation and as
background for those not familiar with our work. A complete
treatment is contained in reference 1. We begin with a description
of the CONFLEX I system; however, readers familiar with the
system can proceed directly to section II.
SECTION I


SUMMARY,  CONDITIONED-REFLEX  THEORY  AND  ITS 
IMPLEMENTATION  IN  CONFLEX  I 




CONFLEX I, shown in figure 1, has been designed for
laboratory experiments in pattern recognition. This system consists
of an optical input (receptor) device and a digital data processor
that transforms the input data, utilizing cross-correlation
decision criteria. In the present configuration of the CONFLEX,
inputs are presented to the receptor by projecting 35 mm slides
onto an array of photo-resistors. This method of presenting
data has proven useful in a wide range of experiments but,
of course, is not the type one would design in a system for
processing transient signals such as those encountered in the
sonar environment. Nevertheless, the method is often expedient
in attempts to evaluate the applicability of the CONFLEX concept
during preliminary explorations of a new problem. Such was the
case in this study program. As a matter of interest, the work
reported herein was the forerunner of a larger-scale program,
with similar objectives, which is now in progress. In this
program, many thousands of examples will be processed utilizing
a more efficient buffer between the magnetic tape records and
the CONFLEX system.

The mode of operation in CONFLEX I is sequential but
efficient for most laboratory experiments. System logic is
implemented with one-megacycle plug-in module cards, and the memory
is a 500,000 bit magnetic disc. All clock and timing
information is derived from the memory disc. In CONFLEX I, the clock
rate is about 300 kc (corresponding to 60 rps and 5100
bits/revolution).


LABORATORY  MODEL  OPERATIONS 

In the laboratory model, information is extracted from input
patterns sequentially by means of a linear threshold circuit
(the D-cell). Connected to this circuit during each clock
interval is a unique random sample of outputs from the sensory
system. The particular random connections depend on the states
of several linear feedback shift registers. As these shift
registers are strobed by the system clock, random, but repeatable,
connections are made at the rate of thirty million per second.
The resulting sequence of D-cell outputs characterizes the
input signal.

In  the  LEARN,  or  adaptive,  phase  of  system  operation,  the 
D-cell  sequences  are  combined  by  the  data  processor  to  form  a 
reference  sequence  associated  with  each  pattern  class.  The 
assignment  of  inputs  to  a class  and  storage  of  references  are 
under  operator  control  during  the  LEARN  phase. 

CONFLEX I automatically classifies an "unknown" input in
the RECOGNIZE phase of system operation. The decision function
is based upon a cross-correlation between the D-cell sequence
for the unknown and each of the stored reference sequences.
Class assignment of the unknown normally corresponds to the
reference yielding the highest positive correlation.
Correlation values can be compared with a fixed threshold in an
alternative mode of operation.

Depending  on  the  mode  of  reference  storage,  6,  24,  or  48 
classes  are  available,  and  correlations  are  performed  in  about 
16.5  msec  per  class.  Hence,  about  3/4  second  is  required  to 
choose  from  among  48  classes  the  assignment  of  an  unknown. 


DATA  PROCESSOR  OPERATIONS 

Given  a set  of  possible  inputs,  together  with  a specified 
class  assignment  for  each,  the  data  processor  derives  and  stores 
data  later  used  to  automatically  decide  the  class  of  an  unknown 
input.  The  basic  decision  process  consists  of  comparing  the 
data  derived  from  the  unknown  input  with  the  stored  data  for 
each  class.  Expected  inputs  are  generally  represented  by  a 
relatively  large  number  of  sample  points. 

In the conditioned-reflex processor (CR System), the
decision function is implemented by cross-correlation. Stored within
the system is reference data corresponding to each class. The
manner in which data is derived from the inputs and combined to
form the references is such as to make cross-correlation highly
efficient for a large number of anticipated recognition problems.

BASIC  ORGANIZATION  OF  THE  CR  SYSTEM 

Figure 2 is a block diagram of the basic organization of
the conditioned-reflex system. Since CONFLEX I has an optical
receptor, we will discuss the model in terms of visual input
patterns (stimuli). The stimuli are placed on a two-dimensional
field (the sensory field), subdivided into $N_S$ sensory
resolution cells (S-cells). A stimulus may be represented by
a vector, $S_i$, in a space of $N_S$ dimensions, where the value of
each component depends upon the light intensity on the
corresponding S-cell. For a white-black pattern, these components
may be taken as unity or zero.


The sensory field (S-field) is connected to a second field
called the discrimination field (D-field). These connections
are made in such a way that vector $S_j$ is transformed into a new
vector, $D_j$, with $N_D$ elementary components. Each component has
a value determined by the output of the corresponding
discrimination cell (D-cell). For example, this output $d_i$ could be +1,
0, or -1 when the algebraic sum of its several inputs is greater
than zero, equal to zero, or less than zero, respectively. In
this case, we refer to a simple linear threshold logic. Other
D-field logics are generally used, yielding superior experimental
results.
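The simple linear threshold logic described above can be sketched as follows. This is only an illustrative model, not the CONFLEX I circuitry: the function names, the fan-in of 20 inputs per D-cell, the random choice of sign for each connection, and the use of a seeded software generator in place of the shift registers are all assumptions made for the example.

```python
import random


def make_connections(n_s, n_d, fan_in, seed=1):
    """Build repeatable pseudo-random S-cell -> D-cell wiring (hypothetical).

    Each D-cell receives `fan_in` (S-cell index, sign) pairs; sign +1 marks
    an excitatory input and -1 an inhibitory one.
    """
    rng = random.Random(seed)  # fixed seed: random, but repeatable
    return [
        [(rng.randrange(n_s), rng.choice((+1, -1))) for _ in range(fan_in)]
        for _ in range(n_d)
    ]


def d_vector(stimulus, connections):
    """Apply the simple linear threshold logic to one stimulus vector."""
    out = []
    for inputs in connections:
        total = sum(sign * stimulus[idx] for idx, sign in inputs)
        out.append((total > 0) - (total < 0))  # +1, 0, or -1
    return out


# Usage: a 400-cell black/white stimulus mapped onto 500 D-cells.
rng = random.Random(7)
stimulus = [rng.randint(0, 1) for _ in range(400)]
wiring = make_connections(n_s=400, n_d=500, fan_in=20)
d = d_vector(stimulus, wiring)
```

Because the wiring is regenerated from the same seed, the same stimulus always yields the same D-vector, which is the property the stored references rely on.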


A D-cell receives a random selection of inputs from the
S-field. This method of connection makes two D-field responses
substantially different (as measured by a comparison of the
corresponding D-cell outputs) even when the stimuli are similar.
It is this property of the system that allows reliable separation
of similar inputs. The intended response of the system when a
stimulus is applied is to select the appropriate one of a set of
$N_C$ responses $R_1, \ldots, R_{N_C}$. The choice of class assignment,
$C_j$ (therefore, response $R_j$), involves the correlation of the
linear threshold measurements made on the unknown pattern by
the D-cells, with each of a number of reference functions stored
in the memory fields (M-fields). These references can be
visualized as a set of vectors (M-vectors) in a multidimensional space
or, alternatively, as a set of planes through its origin. The
set of linear threshold measurements, $d_{xi}$, made upon each input
are the components of the vector $D_x$ drawn from the origin of
the space to the point having coordinates $(d_{x1}, d_{x2}, \ldots, d_{xN_D})$.


Each reference vector is formed by adding vectorially the
D-vectors corresponding to the set of inputs to be classified
together. The components of M-vector $M_j$ are therefore given by

$$ M_j = (m_{j1}, m_{j2}, \ldots, m_{jN_D}), \tag{1} $$

where

$$ m_{ji} = \sum_{k=1}^{N_E} d_{ki} \quad \text{and} \quad i = 1, 2, \ldots, N_D. $$

In this example, $N_E$ input patterns are used to construct the
$j$th reference function. Note that for each reference there is
one $m_{ji}$ corresponding to each D-cell $d_i$.
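Forming a reference by vector addition can be illustrated with a few lines of code. The function name and the three four-component training D-vectors are invented for the example; only the component-by-component summation is taken from the text.

```python
def build_reference(d_vectors):
    """Sum the training D-vectors of one class, component by component."""
    n_d = len(d_vectors[0])
    return [sum(d[i] for d in d_vectors) for i in range(n_d)]


# Three hypothetical training D-vectors (N_E = 3, N_D = 4).
training = [
    [+1, -1, 0, +1],
    [+1, +1, -1, +1],
    [-1, -1, 0, +1],
]
m = build_reference(training)
print(m)  # [1, -1, -1, 3]
```

A component on which the training inputs agree (the last one here) grows in magnitude, while components on which they disagree tend to cancel.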


Figure 3 shows a simple example in only three dimensions;
i.e., $N_D = 3$. The general equation for a plane through the
origin is given by $a x_1 + b x_2 + c x_3 = 0$. When the coefficients,
a, b, c, are replaced by the coordinates of M, the resulting
plane is perpendicular to M; that is, M is normal to the plane:

$$ m_1 x_1 + m_2 x_2 + m_3 x_3 = 0. \tag{2} $$
Figure 3. Geometrical Interpretation of Classification Criterion


The correlation of an input with the $j$th reference is given by

$$ \Phi_{xj} = D_x \cdot M_j = \sum_{i=1}^{N_D} d_{xi} m_{ji}, \tag{3} $$

where $D_x \cdot M_j$ is a vector dot product. This is equivalent to replacing
the $x_i$'s with the $d_{xi}$'s in equation 2. In the three-dimensional
system,

$$ \Phi = m_1 d_1 + m_2 d_2 + m_3 d_3. \tag{4} $$

Except for a normalizing factor $\sqrt{m_1^2 + m_2^2 + m_3^2}$, this is
the distance from the point $(d_1, d_2, d_3)$ to the plane defined
by equation 2. In the CR System, the unknown may be associated
with the reference yielding the largest $\Phi_{xj}$. It is therefore
assigned by the comparator (figure 2) to the class $C_j$ whose
reference vector $M_j$ is closest to $D_x$ or, alternatively,
whose plane $0 = \sum_{i=1}^{N_D} m_{ji} x_i$ is farthest from the point
$(d_{x1}, d_{x2}, \ldots, d_{xN_D})$.
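The comparator rule, correlate the unknown with every stored reference and take the largest dot product, can be sketched as below. The function names, class labels, and three-component vectors are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def correlate(d_x, m_j):
    """Dot product of an unknown D-vector with one reference vector."""
    return sum(d * m for d, m in zip(d_x, m_j))


def classify(d_x, references):
    """Return the label of the reference with the largest correlation."""
    scores = {label: correlate(d_x, m) for label, m in references.items()}
    return max(scores, key=scores.get), scores


# Two hypothetical reference vectors in three dimensions.
references = {"class 1": [3, -2, 1], "class 2": [-1, 3, 2]}
d_x = [-1, 1, 1]
label, scores = classify(d_x, references)
print(label, scores)  # class 2 {'class 1': -4, 'class 2': 6}
```

Dividing each score by the length of its reference vector would give the signed distance from the unknown point to the corresponding plane; the sketch follows the text in comparing the unnormalized values.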


A two-class decision problem using vector notation is
illustrated in figure 4. In this example the unknown,
represented by $D_x$, would be associated with class 2. Note that the
plane (equation 2) is optimum with respect to the point
$(m_1, m_2, m_3)$ in the sense that no other plane can be drawn through
the origin and be as far from the point. However, the plane
may not be optimum with respect to a point $(d_1, d_2, d_3)$; for
this reason, incorrect class assignment is possible.



Figure  4.  Two-class  Decision  Problem 
REVIEW OF IMPORTANT ANALYTICAL RESULTS

The decision process has been described as one of
calculating the several values of the vector dot products given by $D_x \cdot M_j$
and associating the unknown input with the class whose M-vector
yields the largest dot product, where $D_x$ is the D-vector generated
by the system for the unknown $S_x$. There are two cases of interest:
first, $D_x \cdot M_j$, where $S_x$ is not in class $C_j$ (i.e., $D_x \notin M_j$); second,
$D_x \cdot M_j$, where $S_x$ is a member of $C_j$ (i.e., $D_x \in M_j$). The first case
is associated with the noise in the decision process, and the
second is identified with the signal. The dot products are
random variables, the values of which are the possible results of
computing $D_x \cdot M_j$ for a large ensemble of machines identically
conditioned and organized according to the same probabilistic
specification.


Probability  of  Correct  Response 


Figure 2 shows the several dot products entering the
comparator. Let S be the actual value of comparator input
containing the signal; that is, $S = D_x \cdot M_j$, where $D_x$ is one of the
D-vectors originally used to construct $M_j$. The response will
be correct only if none of the $N_C - 1$ other inputs to the
comparator exceeds S. If each of the other inputs has mean $\mu_N$ and
variance $\sigma_N^2$, the probability of the joint event that none of
the other inputs should exceed S is

$$ P(S) = \left[ \frac{1}{\sigma_N \sqrt{2\pi}} \int_{-\infty}^{S} \exp\left( -\frac{(x - \mu_N)^2}{2\sigma_N^2} \right) dx \right]^{N_C - 1} \tag{5} $$

provided the comparator inputs can be described by statistically
independent Gaussian processes.


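For S, a normally distributed random variable with mean $\mu_S$
and variance $\sigma_S^2$, the probability of correct response is derived
by averaging P(S) over S as follows:

$$ P_c = \frac{1}{\sigma_S \sqrt{2\pi}} \int_{-\infty}^{\infty} P(S) \exp\left( -\frac{(S - \mu_S)^2}{2\sigma_S^2} \right) dS. \tag{6} $$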
We would like to have an expression that is more easily evaluated
for various values of $\mu$ and $\sigma$. To obtain an underestimate of $P_c$,
we replace S by its mean value, $\mu_S$, and augment the variance of
the $N_C - 1$ other inputs to the comparator by the variance of S.
Then each of the other inputs has variance $\sigma_T^2 = \sigma_S^2 + \sigma_N^2$, but
the $N_C - 1$ new random variables are no longer independent.
Neglecting this fact (the resulting error is conservative), the joint
probability that none of the other inputs exceeds $\mu_S$ is

$$ P'_c = \left[ \frac{1}{\sigma_T \sqrt{2\pi}} \int_{-\infty}^{\mu_S} \exp\left( -\frac{(x - \mu_N)^2}{2\sigma_T^2} \right) dx \right]^{N_C - 1}. \tag{7} $$

Changing variables by letting $y = (x - \mu_N)/\sigma_T$ transforms $P'_c$ into
normal form

$$ P'_c = \left[ \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\Gamma} \exp\left( -\frac{y^2}{2} \right) dy \right]^{N_C - 1}, \tag{8} $$

where $\Gamma = (\mu_S - \mu_N)/\sigma_T$. The probability of correct response therefore
becomes a tabulated function of $\Gamma$. Values of $P'_c$ as a function
of $\Gamma$ can be found in most mathematics handbooks and in texts
on probability and statistics (references 2 and 3). An
abbreviated table is included here as an appendix.

Hereafter, $\Gamma$ will be referred to as the signal-to-noise
ratio. It is used extensively in reference 1 for comparison of
variations of the CR model. In the following paragraphs, $\Gamma$ is
computed for two cases of special interest. It is used in
section IV to evaluate experimental results.


Unclipped  Reference  Vectors 


Unclipped reference vectors $M_j$ are constructed by vector
addition of the D-vectors corresponding to the training stimuli
assigned to class $C_j$. Each $M_j$ is therefore given by

$$ M_j = D_{j1} + D_{j2} + \cdots + D_{jN_E} \quad \text{(vector sum)}, \tag{9} $$

where $N_E$ is the number of stimuli used in forming the class
reference. Since equation 9 is vector addition, the $i$th
component of $M_j$ is written

$$ m_{ji} = d_{j1i} + d_{j2i} + \cdots + d_{jN_E i}. \tag{10} $$

In order to compute $\Gamma$, one investigates the mean and
variance of the dot products for each of the noise and signal
cases. Hence, the statistics of the products $d_{xi} m_{ji}$ are
required. It is shown in reference 1 that for a CR system in which
the D-cell outputs are uncorrelated random variables with zero
means, the signal-to-noise ratio is

$$ \Gamma^2 = \frac{N_D}{2 N_E} \quad \text{(unclipped reference vectors)}. \tag{11} $$

It is also shown that equation 11 holds when a majority decision
is made such that a D-cell output is +1 when the sum of its
inputs is positive, -1 when the sum is negative, and zero otherwise.

Our estimate of the probability of correct response
(equation 8) is

$$ P'_c = (0.9987)^{N_C - 1} \tag{12} $$

for $\Gamma = 3$. Of particular importance to the design of the CR
system is the number of D-cells, $N_D$. Equation 11 can be used
to determine the $N_D$ required for a given $\Gamma$ and number of stimuli
per class, $N_E$. Hence, for example, if $\Gamma = 3$ and $N_E = 100$, we
have $N_D = 1800$. For $N_C = 20$, our estimate gives $P'_c = 0.9756$;
that is, on the average, about 50 of the 2000 stimuli used to
condition the system would be misclassified.
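The design arithmetic in this example can be checked with a short script. It is a sketch that assumes the unclipped relation $\Gamma^2 = N_D / (2 N_E)$, which is the form consistent with the quoted figures ($\Gamma = 3$, $N_E = 100$, $N_D = 1800$), and the normal-form estimate of equation 8; the function names are invented for the illustration.

```python
import math


def normal_cdf(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def snr_unclipped(n_d, n_e):
    """Signal-to-noise ratio, assuming Gamma^2 = N_D / (2 N_E)."""
    return math.sqrt(n_d / (2.0 * n_e))


def d_cells_required(gamma, n_e):
    """Number of D-cells needed for a target Gamma (same assumption)."""
    return 2.0 * n_e * gamma ** 2


def p_correct_estimate(gamma, n_c):
    """Estimate of correct response: normal_cdf(Gamma) ** (N_C - 1)."""
    return normal_cdf(gamma) ** (n_c - 1)


n_d = d_cells_required(3.0, 100)   # 1800.0
p = p_correct_estimate(3.0, 20)    # close to the quoted 0.9756
expected_errors = 2000 * (1.0 - p)  # roughly 50 of 2000 stimuli
```

The script evaluates the normal distribution function without rounding, so its result differs from 0.9756 in the third decimal place; the quoted value follows from the tabulated 0.9987 raised to the 19th power.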

In a digital system, the number of bits of storage required
for the reference vectors in the unclipped case is equal to
$N_C N_D \log_2(2N_E)$. A method for substantially reducing this
requirement relates to a "clipped system," described next.
In the clipped system, these components are replaced by binary
variables $m'_{ji}$ having integral values either +1 or -1. The
clipping operation is such that $m'_{ji} = +1$ when $m_{ji} \ge 0$, and
$m'_{ji} = -1$ when $m_{ji} < 0$. It should be emphasized that this
operation takes place only after the corresponding unclipped reference
vector has been formed.


The signal-to-noise ratio for this clipped system is

$$ \Gamma^2 = \frac{N_D}{\pi N_E} \quad \text{(clipped reference vectors)}. \tag{14} $$

Using the same numbers as in the previous example (i.e., $N_D = 1800$;
$N_E = 100$), the signal-to-noise ratio for this case becomes
$\Gamma = 2.4$. The probability of correct response (again for $N_C = 20$)
is $P'_c = 0.8550$. About 290 of the 2000 trained stimuli would be
misclassified.
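As a check on the clipped-system figures quoted above, the arithmetic can be written out. This is a worked restatement under the assumption that the clipped signal-to-noise ratio satisfies $\Gamma^2 = N_D / (\pi N_E)$, as the quoted values imply; $\Phi$ here denotes the standard normal distribution function, a symbol introduced only for this check.

```latex
\Gamma = \sqrt{\frac{N_D}{\pi N_E}} = \sqrt{\frac{1800}{100\,\pi}} \approx 2.39 \approx 2.4,
\qquad
P'_c = \Phi(2.4)^{N_C - 1} \approx (0.9918)^{19} \approx 0.855,
\qquad
2000\,(1 - 0.855) = 290 .
```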


It is instructive to compute the number of D-cells required
in the clipped system to yield $\Gamma = 3$ (therefore, the same $P'_c$ as
in the unclipped example), then to compare the required binary
storage for the two systems:

$$ N_D \text{ (clipped)} = 2826. \tag{15} $$
In general, $N_D$ (clipped) $= \frac{\pi}{2} N_D$ (unclipped). Since the number
of bits of storage required for the clipped system is equal
simply to $N_D N_C$, the two systems compare on an equal basis
as follows:

$$ \frac{\text{(required storage clipped)}}{\text{(required storage unclipped)}} = \frac{N_C \, \frac{\pi}{2} N_D}{N_C N_D \log_2 2N_E} = \frac{\pi}{2 \log_2 2N_E}. \tag{16} $$

For $N_E = 100$, the clipped system requires about 23 percent as
much storage as is used by the unclipped model.


This economy of storage is used to good advantage in
CONFLEX I. In this system, six classes of unclipped storage are
replaced by 48 classes in the clipped operating mode.


Replacement  of  Comparator  by  a Fixed  Threshold 

The comparator illustrated in figure 2 compares the $N_C$
correlations $D_x \cdot M_j$ and selects the response corresponding to
the memory vector producing the greatest positive correlation.
An alternative mode (implemented in CONFLEX I) is that in which
the correlations are compared with a fixed threshold, T. The
rule for this mode of operation is that the response $R_j$ is made
if $D_x \cdot M_j \ge T$.

In such a system, it is clearly possible to produce two
or more simultaneous responses, a situation repeatedly
encountered in statistical-decision theory. Two types of errors must
be considered. The first is that of failure to produce a
response when, in fact, $D_x$ is contained in $M_j$. The probability
of this error is

$$ P_1 = \frac{1}{\sigma_S \sqrt{2\pi}} \int_{-\infty}^{T} \exp\left( -\frac{(x - \mu_S)^2}{2\sigma_S^2} \right) dx. \tag{17} $$

The second type of error is that of incorrectly producing one
or more responses, $R_j$, when, in fact, $D_x$ is not contained in
$M_j$. The probability of this error is

$$ P_2 = 1 - \left[ \frac{1}{\sigma_N \sqrt{2\pi}} \int_{-\infty}^{T} \exp\left( -\frac{(x - \mu_N)^2}{2\sigma_N^2} \right) dx \right]^{N_C - 1}. \tag{18} $$


The probabilities $P_1$ and $P_2$ depend upon the threshold T, and
the value of T can be chosen in a variety of ways. For example,
if losses or penalties are assigned to each type of error, one
can ask for the value of T which minimizes the overall expected
loss. This is Bayes' criterion.


Alternatively,  one  may  stipulate  that,  say,  the  first  type 
of  error  is  to  be  held  below  a prescribed  level,  and  within 
this  restriction  a value  of  T is  to  be  chosen  which  minimizes 
the  second  type  of  error.  This  is  the  Neyman-Pearson  criterion. 

Using a fixed threshold, even with the most favorable choice
of T, will require a greater value of $N_D$ to meet a specified
overall error rate. This is an understandable result in view
of the added constraint imposed by the fixed threshold. One
advantage of using a fixed threshold criterion for response
selection is that the response selection logic becomes
relatively easy to implement.
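The fixed-threshold rule and its two error types can be illustrated with a small sketch. It assumes, as in the comparator analysis, that in-class correlations are Gaussian with mean mu_s and standard deviation sigma_s, and that out-of-class correlations are independent and Gaussian with mean mu_n and standard deviation sigma_n. The function names and the sample scores are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def threshold_responses(scores, t):
    """Every class whose correlation reaches the threshold responds."""
    return [label for label, value in scores.items() if value >= t]


def p_miss(t, mu_s, sigma_s):
    """First error type: the in-class correlation falls below T."""
    return normal_cdf((t - mu_s) / sigma_s)


def p_false_response(t, mu_n, sigma_n, n_c):
    """Second error type: at least one of N_C - 1 out-of-class
    correlations reaches T (independence assumed)."""
    return 1.0 - normal_cdf((t - mu_n) / sigma_n) ** (n_c - 1)


scores = {"A": 62, "B": 35, "C": 8}       # hypothetical correlations
print(threshold_responses(scores, 30))    # ['A', 'B']: two responses
print(threshold_responses(scores, 70))    # []: no response at all
```

Raising T lowers the chance of a false response but raises the chance of a miss, which is the trade-off that the Bayes and Neyman-Pearson criteria resolve in different ways.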


SUMMARY  OF  MODEL  PARAMETERS 


CONFLEX I can be used as an unclipped or clipped model,
and responses can be selected either by the comparator or, if
desired, by the fixed-threshold method. It is also possible to
"partially clip" the M-vector components in such a way that

$$ m'_{ji} = +1 \quad \text{when} \quad m_{ji} \ge +t, $$
$$ m'_{ji} = -1 \quad \text{when} \quad m_{ji} \le -t, $$
$$ m'_{ji} = 0 \quad \text{when} \quad -t < m_{ji} < +t. $$

A discussion  of  this  general  clipping  mode  is  found  in  reference  4. 


The number of D-cells implemented in CONFLEX I is variable
in steps from 500 to 5000. In the clipped mode, 48 classes
are possible, while six are implemented in the unclipped mode.
Since two bits per $m_{ji}$ are required in partially clipped
operation, 24 classes are possible in this mode.


The sensory system in CONFLEX I consists of 400
photo-resistors connected in a checkerboard pattern of plus and minus
contributors. These will be referred to as excitatory and
inhibitory cells, respectively. Reference 1 shows the desirability
of having a relatively large number of these cells connected to
each D-cell.
EVALUATING SYSTEM PERFORMANCE

Equation 8, which gives the probability of correct response
under controlled experimental conditions, is a generally useful
expression for estimating system performance. Evaluation of this
expression requires that relatively large experiments (i.e.,
numbers of patterns) be performed to obtain statistically
significant data. The appendix tabulates the probability of correct
response, $P'_c$, in a two-class ($N_C = 2$) problem for $\Gamma$ in the range
$0.0 \le \Gamma < 4.0$. The values are plotted in figure 5.


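The quantity $\Gamma$, which was derived earlier, is repeated here
for convenience:

$$ \Gamma = \frac{\mu_S - \mu_N}{\sigma_T}, \tag{19} $$

$\mu_S = E(D \cdot M_j)$, D is contained in $M_j$
$\mu_N = E(D \cdot M_j)$, D is not contained in $M_j$
$\sigma_T^2 = \sigma_S^2 + \sigma_N^2 - 2\,\mathrm{Cov}(D \cdot M_j,\; D \cdot M_k)$

where the prefix E is used to denote expected value.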

An experimental $\Gamma$ may be computed by substituting empirical
results in the above expression. In the following example of
this derivation, correlation values are given for a two-class
experiment. Ten images were used in the experiment, five to
form each of two classes. The correlation values that resulted
are listed below.

Image Number    In-class Signal Values    Out-of-class Noise Values
     1                   66                        -5
     2                   60                        22
     3                   33                        -5
     4                   80                        13
     5                   57                         6
     6                   66                        20
     7                   57                         7
     8                   44                        16
     9                   62                         1
    10                   41                        16

The statistical measures appearing in equation 19 are the
mean, $\mu$, variance, $\sigma^2$, standard deviation, $\sigma$, and covariance. In
evaluating experimental results, we use sample statistics and
compute these measures using the following expressions:

Mean
$$ \mu = \frac{1}{n} \sum_{i=1}^{n} x_i \tag{20} $$

Variance
$$ \sigma^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \mu)^2 \tag{21} $$

Covariance
$$ \mathrm{Cov}(x, y) = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \mu_x)(y_i - \mu_y) \tag{22} $$

where n is the total number of sample results (e.g., ten in the
above example).
Using the in-class and out-of-class correlation values
listed above, we find that

$\mu_S$ = 56.6 = mean of the in-class values
$\mu_N$ = 9.1 = mean of the out-of-class values
$\sigma_S^2$ = 191.6 = variance of the in-class values
$\sigma_N^2$ = 97.0 = variance of the out-of-class values
Cov = 21.9 = covariance of the in-class and out-of-class values

Substituting these values into the expression for $\Gamma$,

$$ \Gamma = \frac{\mu_S - \mu_N}{\sqrt{\sigma_S^2 + \sigma_N^2 - 2\,\mathrm{Cov}}} = \frac{56.6 - 9.1}{\sqrt{191.6 + 97.0 - 2(21.9)}} \approx 3.0. $$


Referring to the appendix, we observe that a $\Gamma$ of 3.0 indicates
a probability of correct response equal to 0.9987 or an expected
error rate of 13 per 10,000 in a two-class experiment.
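The experimental signal-to-noise ratio for the ten-image example can be recomputed from the listed correlation values. The sketch below uses sample statistics with an n - 1 divisor, which is the choice that reproduces the figures quoted in the text; the helper names are invented for the illustration.

```python
import math

# Correlation values from the ten-image, two-class example.
signal = [66, 60, 33, 80, 57, 66, 57, 44, 62, 41]   # in-class
noise = [-5, 22, -5, 13, 6, 20, 7, 16, 1, 16]       # out-of-class


def mean(xs):
    return sum(xs) / len(xs)


def covariance(xs, ys):
    """Sample covariance with an n - 1 divisor."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


mu_s, mu_n = mean(signal), mean(noise)
var_s = covariance(signal, signal)   # variance is the self-covariance
var_n = covariance(noise, noise)
cov = covariance(signal, noise)

gamma = (mu_s - mu_n) / math.sqrt(var_s + var_n - 2.0 * cov)
print(round(mu_s, 1), round(mu_n, 1), round(var_s, 1),
      round(var_n, 1), round(cov, 1), round(gamma, 1))
# 56.6 9.1 191.6 97.0 21.9 3.0
```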

This example illustrates the relatively high signal-to-noise
ratio that results when relatively few D-vectors are used to form
each M-vector. Statistically significant error-rate data is
obtained by performing hundreds of experiments or, alternatively,
performing relatively few experiments but using a very large
number of images per class. By computing the experimental $\Gamma$,
experiments can be designed which are relatively simple and yet
statistically meaningful. Hereinafter, the experimental $\Gamma$ is
used to measure system performance since the expected error
rate may be readily determined from either figure 5 or the
appendix.

SECTION II


BASIS  FOR  CLASSIFICATION 


In the unrestricted classification problem, one attempts
to mechanize mathematical transformations in which all "similar
patterns" are mapped into a single representative class symbol.
These invariances have long been sought by engineers and
mathematicians as a means of obviating complications introduced by
variations in patterns; e.g., medium effects, target strength,
aspect, etc., related to the sonar. Most of the transformations
discovered are useful only with pattern environments of academic
interest, while those exhibiting the desired generality have
proven unusually complex when they are considered as a basis
for machine design.

The design of a classification system begins with the
selection of a set of measurements. The measurements are usually
selected on the basis of the best available physical knowledge
of the problem, but they often include measurements which past
experience has shown to be useful. Sonar systems include a
number of displays that serve as the primary source of
classification information for the experienced operator. The sonarman
speaks of echo quality and doppler, echo strength, and echo
length when listening to the audio. On his PPI scope he looks
for pip shape, intensity, target angle and movement to give him
additional clues. From his graphic recorder he looks at edge
alignment, length, and structural highlights. Analysis of these
parameters can often reveal conclusively the presence or absence
of a submarine target. Generally, however, any single parameter
is insufficient, and one expects the reliability of the
classification to improve as more parameters are brought in.

It would be difficult to establish exactly how a given
operator mentally processes such information to arrive at the
contact classification. Even when they are presented identical
data, two observers reach their respective decisions in somewhat
different ways; for example, each may express a substantially
different level of confidence in his response or even a
different conclusion. Subjectivity is reduced when the decision
process is performed rigidly and uniformly in all cases. NEL's
Flexchart system, HHIP and MITEC were developed as aids to a
uniform classification process. Established principles of
decision theory can also be applied.

By assigning a numerical value, $v_i$, to each of the several
measured parameters, $p_i$ (display outputs), it is possible to
write a general expression of the mathematical probability that
a contact is a submarine. Suppose, as is depicted in figure 6,
the prior probability of a contact's being a submarine (that
proportion of the contact space which is submarines) is known.
The shaded portion of the figure represents all those contacts
in the space which display output values $v_1, v_2, \ldots, v_n$ or, as
a set, $V$. Note that some are submarine contacts and some are
nonsubmarine. The probability that a contact is a submarine is
derived directly from the definition of a conditional probability
and is often referred to as the inverse (or Bayes') probability
law,

$$P(s \mid v_1, v_2, \ldots, v_n) = \frac{P(s)\, P(v_1, v_2, \ldots, v_n \mid s)}{P(v_1, v_2, \ldots, v_n)} \qquad (23)$$

which reads, "the probability that the contact is submarine,
given that the value of parameter $p_1$ is $v_1$, $p_2$ is $v_2$," etc.

If the variables, $v_i$, range over a continuum of values,
$P(s \mid v_1, v_2, \ldots, v_n)$ becomes a probability density function. The
operator provided with the prior probability density functions
$P(s)$, $P(v_1, v_2, \ldots, v_n \mid s)$ and $P(v_1, v_2, \ldots, v_n)$ could, at least
in principle, compute the probability that a particular contact
of measured parameters $(v_1, v_2, \ldots, v_n)$ is a submarine. Assuming
a good selection of parameters, his average performance depends,
of course, upon how well the prior probabilities represent the
current situation. In the more practical approach, the sonarman
depends upon his experience and training to provide this
information.


A data processing system could be built to perform the
computations of equation (23), aiding or, possibly, replacing the
sonar operator in this function. This system, an inverse
probability computer, provides storage for the functions (or their
approximations) on the right-hand side of the equation. The
reader familiar with the TRESI system will immediately note its
similarity to an inverse probability computer.
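
As an illustration of the computation such an inverse probability computer would carry out, the following Python sketch evaluates the inverse probability law for one observed parameter set. It assumes only two exhaustive classes (submarine and nonsubmarine), so that the denominator can be expanded as P(V) = P(s)P(V|s) + P(ns)P(V|ns). The function name and all numerical values are hypothetical and are not taken from the study.

```python
def posterior_submarine(prior_s, likelihood_s, likelihood_ns):
    """Return P(s | V) for a two-class problem by the inverse probability law.

    prior_s       -- P(s), prior probability that a contact is a submarine
    likelihood_s  -- P(V | s), probability of the observed values given submarine
    likelihood_ns -- P(V | ns), probability of the observed values given nonsubmarine
    """
    joint_s = prior_s * likelihood_s
    joint_ns = (1.0 - prior_s) * likelihood_ns
    evidence = joint_s + joint_ns  # P(V), assuming the two classes are exhaustive
    return joint_s / evidence


# Hypothetical numbers, for illustration only.
p = posterior_submarine(prior_s=0.3, likelihood_s=0.5, likelihood_ns=0.1)
print(round(p, 4))
```

In practice the stored functions would be tables or approximations indexed by the measured values; the sketch takes the two likelihoods as already looked up.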

In general, the conditioned-reflex (CR) concept represents
a similar attack on the automatic classification problem;
however, closer inspection reveals two important differences. First,
it is oftentimes not known a priori which parameters should be
measured to permit reliable discrimination between several
signal classes. One can suggest hundreds of possible sonar signal
measurements that have been of value to contact classification;
e.g., doppler, doppler derivative, echo length, envelope rise
time, even sample levels themselves, etc. In the TRESI system,
a restricted number of parameters have been selected and special
instrumentation provided to measure each parameter. In the CR
system, on the other hand, a large number (as many as 5000 in
CONFLEX I) of linear threshold measurements generate parametric
data to be used in the classification process. These
measurements are relatively simple to implement and are all accomplished
by essentially identical instrumentation. An important feature
of this approach is the fact that the measurements are made
without involvement of a human operator.

The second difference between the conventional inverse
probability computer and the CR classification system is that, in
the latter, cross-correlation is used as the basic decision
function. As is well known, the correlation and inverse probability
approaches are equivalent (with respect to a given set of
parameters) when signal corruption is in the form of additive Gaussian
noise, which, it is emphasized, is not the case in the sonar
environment. The important advantage of cross-correlation is
that relatively simple hardware is required for implementation
of the computational algorithms. This is generally not true of
the inverse probability computer.
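
A minimal sketch of classification by correlation comparison follows. It assumes that each class is represented by a stored reference vector and that an input is assigned to the class whose reference yields the largest correlation (here a plain inner product). The reference vectors, names, and values are hypothetical; the sketch does not model how CONFLEX I forms its reference functions or its linear threshold measurements.

```python
def correlation(x, reference):
    """Inner product of an input vector with a class reference vector."""
    return sum(a * b for a, b in zip(x, reference))


def classify(x, references):
    """Return (best_class, scores), choosing the class with the largest correlation."""
    scores = {name: correlation(x, ref) for name, ref in references.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical four-element references for two classes.
references = {"S": [1, 1, 0, 0], "NS": [0, 0, 1, 1]}
label, scores = classify([1, 1, 0, 1], references)
print(label, scores)
```

Only multiplications, additions, and a comparison are required, which is the hardware simplicity the text attributes to the correlation approach.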
DATA PREPARATION


As mentioned previously, experimental work with the
CONFLEX I system requires that information to be classified be
represented by a light transmission pattern on photographic
slides. The data provided for the sonar classification
experiments consisted of analog waveforms recorded on magnetic tape.
Hence, the conversion to photographic patterns involved certain
preprocessing of the signal data.

Signal preprocessing and slide preparation were greatly
aided by the help of personnel from the Applied Mathematics
Laboratory of the David Taylor Model Basin and the availability
for this study of their analog-to-digital equipment, the IBM
7090 Computing System, and the GD/Stromberg-Carlson 4020
Microfilm Recorder. Program funding had been apportioned by the Bureau
of Ships to the Model Basin for this assistance.

The sonar signal recordings included examples of both
submarine and nonsubmarine returns. The recordings were contained
on six reels of quarter-inch tape, four of which were chosen
for the classification experiment. The data was obtained from
the tape library of the Sonar Tape Analysis and Recording
Department at the U. S. Fleet Anti-Submarine Warfare School, San Diego,
California. Descriptive information relating to the reflectors
accompanied the data.


Before digitizing the data, it was necessary to record on
a second channel of the tape a negative trigger gate which was
used by the analog-to-digital converter to initiate the
sampling of each return. Localization of the echoes was accomplished
by an experienced operator from an A-scope and audio display.
Some 1514 returns were prepared in this manner: 17 examples of
submarines (1153 returns) and 6 examples of nonsubmarines (361
returns).

Figure 7 illustrates the several steps involved in
preprocessing the sonar signals. As shown in the figure, the sonar
data was amplified, demodulated, and passed through a 300 cps
low-pass filter. It was originally intended that demodulation
and filtering would be performed after digitizing, but it was
discovered that a disproportionate amount of computer time would
be required. Design of a suitable detector and low-pass filter
by SCOPE reduced the computer processing time per return from
seven minutes to less than one minute, a large portion of which
was required to produce a formatted digital magnetic tape
for the microfilm recorder. The raw sonar data and a demodulated
and filtered return are shown in photographs (1) and (2) of
figure 8.

Conversion of this waveform to digital form was accomplished
using a ten-bit analog-to-digital converter which began sampling
on command from the analog recorder and continued until 1400
samples (700 msec) were generated. A plot of these samples is
shown in photograph (3) of figure 8. These samples were recorded
on digital magnetic tape and were subsequently read into the IBM
7090 computing system for further processing. The computer was
used to select 200 msec of the signal from the 700 msec originally
sampled. The particular 200 msec selected corresponded to 100
msec on either side of the signal peak amplitude. Next, the
computer normalized the data and requantized the 400 samples into
16 amplitude levels. The result of this operation is shown in
photograph (4) of figure 8. Sixteen levels were chosen so as to
correspond to the number of shades of gray that were developed
for pattern representation in slide form. Coordinate data (36
points per amplitude level; a maximum of 576 points per sample)
plus supporting format control data were then placed on magnetic
tape for use by the SC 4020 microfilm recorder. The format for
the patterns was generated as a separate computer subroutine.

Figure 8. Illustrations of Preprocessing Stages
(1) Oscilloscope Photograph of Raw Data
(2) Demodulated-Filtered Waveform and Raw Data From Which It Was Derived*
(3) Plot Taken From Digitized Data
(4) Data Normalized and Quantized in 16 Amplitude Levels
*This waveform was taken from a different return from others
on this page.
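
The window selection and requantization steps can be sketched in Python as follows. The sampling rate of 2000 samples per second follows from the stated 1400 samples in 700 msec; the min-max normalization, the numbering of levels from 1 to 16, the clamping of the window at the record edges, and the function name are assumptions made for illustration, since the report does not specify them.

```python
def preprocess(samples, rate_hz=2000, half_window_ms=100, levels=16):
    """Select the window around the peak, normalize it, and requantize it.

    Returns a list of integer amplitude levels in the range 1..levels.
    """
    half = rate_hz * half_window_ms // 1000  # 200 samples on either side of the peak
    peak = max(range(len(samples)), key=lambda i: samples[i])
    # Assumption: shift the window inward if the peak lies near a record edge.
    start = max(0, min(peak - half, len(samples) - 2 * half))
    window = samples[start:start + 2 * half]
    lo, hi = min(window), max(window)
    span = (hi - lo) or 1  # avoid division by zero for a flat window
    # Assumption: min-max normalization, then equal-width bins numbered 1..levels.
    return [min(int((s - lo) / span * levels) + 1, levels) for s in window]


# Hypothetical triangular envelope peaking at sample 700 of a 1400-sample record.
envelope = [max(0, 1000 - abs(i - 700)) for i in range(1400)]
quantized = preprocess(envelope)
print(len(quantized), min(quantized), max(quantized))
```

With 36 plotted points per amplitude level, a sample at level 16 corresponds to the stated maximum of 576 points.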

The several levels of gray on the microfilm were obtained by
assigning a uniform point density linearly proportional to a
corresponding signal amplitude. The progression of point
densities from light to dark (corresponding to movement from low
amplitude level to high amplitude level) was also linear. The
film field was divided into 400 cells, each with a point density
corresponding to the amplitude level of the signal sample to
which that cell was assigned. Cell assignment remained fixed
for all slides and, for the sake of convenience, was left-to-right,
top-to-bottom.
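
A short sketch of the fixed cell assignment follows. It maps each of the 400 quantized samples to a cell in left-to-right, top-to-bottom order and gives each cell a point count proportional to its amplitude level, using the 36 points per level stated earlier. The 20-by-20 arrangement of the 400 cells and the function name are assumptions; the report states only the number of cells and the ordering.

```python
def cell_layout(levels, columns=20, points_per_level=36):
    """Return (row, column, point_count) for each sample, in row-major order."""
    cells = []
    for index, level in enumerate(levels):
        row, col = divmod(index, columns)  # left-to-right, then top-to-bottom
        cells.append((row, col, level * points_per_level))
    return cells


# Hypothetical slide in which every sample is at the highest level.
layout = cell_layout([16] * 400)
print(layout[0], layout[21], layout[399])
print(sum(points for _, _, points in layout))
```

Under these assumptions a fully saturated slide would need 230,400 plotted points, which gives a sense of the volume of coordinate data written to tape for each slide.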

The SC 4020 microfilm recorder under program control plots
on its Charactron tube any of 68 characters at any of $2^{20}$
plotting positions in a square matrix, then photographs the tube
automatically at speeds up to seven frames per second. The
processed 35 mm microfilm (a litho high-contrast) can also be used
to generate xerographic hard copy of data which, with proper
programming, can be produced in graphic or tabular form. If
point-density plots are used, the coded photographic
transparencies are formed in much the same fashion as half-tone prints.

The  resulting  slide  for  the  waveform  shown  in  figure  8(1) 
is  reproduced  in  figure  9.  It  will  be  noticed  that  the  pattern 
is  divided  into  400  squares  or  cells,  one  per  sensor  input  to 
CONFLEX  I and  each  corresponding  to  an  amplitude  sample  of  the 
waveform  envelope. 

The  SC  4020  microfilm  recorder  is  currently  operated  at 
DTMB  as  an  off-line  equipment  and  must  therefore  take  its  input 
from  digital  magnetic  tape.  A very  large  number  of  plotting 
coordinates  (approximately  120,000  on  the  average),  in  addition 
to  format  control,  had  to  be  read  in  to  generate  a single  slide. 
This  resulted  in  an  unexpectedly  long  time  (about  45  seconds) 
to  generate  the  pattern  and  of  course  increased  the  cost  per 
slide  to  an  unrealistic  level.  The  original  cost  estimate  was 
less  than  ten  cents  per  frame,  conservatively  based  upon  the 
speed  quotation  previously  mentioned  (7  frames/second) . We  have 
since  discovered  that  the  originally  stated  specifications  apply 
to  an  on-line  system. 


We therefore felt it unwise to continue to generate
transparencies at this high cost in view of the size of the data set.

Our search for a more realistic input buffer led us to propose
a more complex but far more efficient and flexible system. We
are pleased that construction of the system, which utilizes an
SDS-925 general-purpose computer in direct electrical
communication with the CONFLEX I, was funded under BUSHIPS Contract NObsr
93231 and is now operating. This increased efficiency was
particularly necessary due to an order of magnitude increase in
the data set.

Amplitude Levels for Coded Transparency of Figure 9
(u thru z and m represent 10 thru 15 and 16, respectively.)


Nevertheless, SCOPE was directed to continue the program,
using the SC 4020 to generate slides on as much of the data as
remaining funds allowed. This data was then to serve as the
data base for the classification experiments. Included in this
set were 205 returns: 132 returns from a single submarine example
and 73 returns from three nonsubmarine examples. The submarine
was described in the accompanying legend as a beam-to-bow aspect
submarine showing secondary echoes. The nonsubmarine examples
were described as biologics and fishnets; a single, large biologic;
and kelp patches with good audio qualities and with no change in
length. The type of sonar and its frequency and mode were not
given; however, we believe it to be an SQS-4 series sonar in
medium pulse mode. Results of experiments with these inputs
are contained in Section IV.


SECTION  IV 


EXPERIMENTS 


The experiments performed during this study were designed
to evaluate the capability of the CONFLEX I pattern recognition
system to distinguish representations of a number of sonar
returns of submarine reflectors from those of nonsubmarine
reflectors. The particular data set consisted of 205 returns: 132
returns from a single submarine example and 73 returns from
three nonsubmarine examples. As mentioned previously, the
submarine reflector was described in the legend that accompanied
it as a beam-to-bow aspect submarine showing secondary echoes.
The nonsubmarine examples were likewise described as biologics
and fishnets; a single, large biologic; and kelp patches with
good audio qualities and with no change in length. The data
provided for these experiments consisted of the audio display
output recorded on magnetic tapes. Preprocessing and formatting
have been detailed previously in Section III.

Three major experiments involving the submarine and
nonsubmarine returns were run on the CONFLEX I. The first consisted
of training the system with all submarine returns in class S and
all nonsubmarine returns in class NS. To test the system's
ability to separate these two classes in this "closed end" (no unknowns)
case, all inputs were then applied and the resulting class
assignments and correlations with the reference functions were recorded.

In the second experiment, to obtain a measure of the system's
ability to categorize inputs which had not been used for training
(the "open-ended case"), a set of 66 (the odd-numbered submarine)
returns from each of classes S and NS was selected for the
conditioning process. The "unknown" inputs were then applied for
class assignment. Error correction procedures were also explored.

To test the performance of CONFLEX I when the submarine return
population was subdivided into several classes according to
aspect, a third experiment was performed. Six aspect classes were
defined in the training routine. The nonsubmarine population
was subdivided into three classes according to the type of
contact.


CLOSED  END  EXPERIMENT 

Of the 132 returns in the S class, one submarine return
was misclassified and one assignment was a borderline case for
a correct S classification rate of 98.5%. (The borderline case
was counted as an error.) Four of the 73 nonsubmarine returns
were misclassified and ten were borderline decisions. Counting
half of the borderline cases as correct (corresponding to chance
selection), nonsubmarine returns were correctly classified at
a rate of 87.7%. The total correct classification rate for the
entire input was 94.1%.
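
The two per-class rates above follow from the stated counts, as the short Python check below shows. The function name and the "borderline credit" parameter are introduced here for illustration only; the credit is 0 where borderline cases were counted as errors and 0.5 where half of them were counted as correct.

```python
def correct_rate(total, misclassified, borderline, borderline_credit):
    """Percent correct, crediting a given fraction of the borderline cases."""
    correct = total - misclassified - borderline * (1.0 - borderline_credit)
    return 100.0 * correct / total


# Class S: 132 returns, 1 misclassified, 1 borderline counted as an error.
print(round(correct_rate(132, 1, 1, 0.0), 1))
# Class NS: 73 returns, 4 misclassified, 10 borderline with half counted correct.
print(round(correct_rate(73, 4, 10, 0.5), 1))
```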

Figure 10 is a distribution plot of the correlation values
obtained when all inputs (both submarine and nonsubmarine returns)
were compared with the S class reference function. If a
threshold level is set at 640 and all inputs which evoked a larger
correlation value are called submarine and those inputs with
lower correlation values nonsubmarine, a correct classification
rate for the submarine returns of 72.0% is observed. The
nonsubmarine return inputs yield 84.9% correct response. A lower
threshold setting at 600 yields a 93.9% correct classification
rate for the submarine returns and 35.6% for the nonsubmarine
inputs. Note that these results are achieved with training
utilizing only submarine returns. A similar distribution is
plotted for the correlation values obtained when all inputs
were compared with the NS class reference function. With a
threshold setting of 640, a correct submarine classification
rate of 87.9% is obtained and 39.7% for the nonsubmarine returns.

Curves can be derived from each of the above two
distributions which characterize the system performance (often referred
to as "ROC," receiver operating characteristic, curves) when a
comparison of the correlation value with a fixed threshold is
used as the decision criterion. The threshold setting becomes
a parameter. Two types of errors are defined: "error 1" occurs
when the input is called a member of class NS when, in fact, it
is a member of class S; "error 2" occurs when the input is called
a member of class S when, in fact, it is a member of the NS
category. Figures 12 and 13 are the "ROC" curves generated from
the distributions shown in figures 10 and 11, respectively. The
chance line always shows an equal total number of errors. Notice
that the "best case" in each of the figures is 77% and 78%
correct response, respectively. This is somewhat less than the
94% correct classification rate achieved by CONFLEX I, because
CONFLEX I uses a correlation value comparator for its assignment
criterion rather than the fixed threshold.
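
The construction of such a curve from a distribution of correlation values can be sketched as follows. For each candidate threshold the sketch counts both error types, calling an input submarine when its correlation with the S reference exceeds the threshold. The correlation values used are hypothetical and are not the data of figures 10 and 11; the function names are likewise illustrative.

```python
def roc_points(s_values, ns_values, thresholds):
    """Return (threshold, error1, error2) for each fixed threshold setting."""
    points = []
    for t in thresholds:
        error1 = sum(1 for v in s_values if v <= t)  # class S input called NS
        error2 = sum(1 for v in ns_values if v > t)  # class NS input called S
        points.append((t, error1, error2))
    return points


def best_case(points):
    """Return the point with the fewest total errors."""
    return min(points, key=lambda p: p[1] + p[2])


# Hypothetical correlation values against the S class reference function.
s_values = [650, 660, 620, 700]
ns_values = [590, 610, 630, 645]
points = roc_points(s_values, ns_values, thresholds=[600, 640])
print(points)
print(best_case(points))
```

Sweeping the threshold trades one error type against the other; the "best case" is the setting at which the total number of errors is smallest.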
we find that


1028 = variance of the in-class values


1783 = variance of the out-of-class values


An experimental Γ may be computed by substituting these values
into the expression for Γ (equation 19, Section I),


1028 + 1783


As noted in the appendix, a Γ of 1.3 indicates a probability of
correct response equal to 0.9032 or an expected error rate of
968 per 10,000 in a two-class experiment. This compares with
a correct response rate of 0.941 for the data set.
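
The tabulated probabilities in the appendix coincide, to four decimal places, with the standard normal cumulative distribution evaluated at the separation measure (written here as gamma). On that observation, which is an inference from the tabulated values rather than a statement of the report, the quoted figures for a value of 1.3 can be checked as follows.

```python
import math


def p_correct(gamma):
    """Standard normal cumulative distribution evaluated at gamma."""
    return 0.5 * (1.0 + math.erf(gamma / math.sqrt(2.0)))


p = p_correct(1.3)
print(round(p, 4))                 # probability of correct response
print(round((1.0 - p) * 10000))    # expected errors per 10,000 trials
```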
OPEN END EXPERIMENT


Of the 66 submarine returns used in training the S class,
seven were misclassified as NS and one decision was borderline
for a correct S classification rate of 87.9%. (The borderline
case counted as an error.) Of the 66 unknown submarine returns
(those not used in training), 72.7% were correctly classified.
Two of the 66 nonsubmarine returns used in training were called
S, and one assignment was borderline for a correct
classification rate of 95.5%. (The borderline case again counted as an
error.) Six of the seven unknown nonsubmarine returns were
correctly called NS. Classification of the entire set of
submarine returns was 80.3% correct, even though half (every other
one) of these returns were not used in the training. Overall
NS classification was 94.5% correct. Error correction procedures
improved these responses to 89.4% and 95.9%, respectively.
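
The combined rates quoted above can be reproduced from the trained and unknown subsets, as in the following check. The count of 48 correct unknown submarine returns is inferred here from the stated 72.7% of 66 and is not given explicitly in the text; the helper name is illustrative.

```python
def percent(correct, total):
    """Percent correct, rounded to one decimal place."""
    return round(100.0 * correct / total, 1)


# Submarine returns: 66 used in training (7 misclassified + 1 borderline error)
# and 66 unknown (48 correct, inferred from the stated 72.7%).
s_trained_correct = 66 - 7 - 1
s_unknown_correct = 48
print(percent(s_trained_correct, 66), percent(s_unknown_correct, 66))
print(percent(s_trained_correct + s_unknown_correct, 132))

# Nonsubmarine returns: 66 used in training (2 called S + 1 borderline error)
# and 7 unknown (6 correct).
ns_trained_correct = 66 - 2 - 1
ns_unknown_correct = 6
print(percent(ns_trained_correct, 66))
print(percent(ns_trained_correct + ns_unknown_correct, 73))
```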
MULTICLASS EXPERIMENT WITH ASPECT ANGLE GROUPING

As  mentioned  previously,  a change  in  the  target  angle,  hence 
echo  length,  of  the  reflector  is  one  variation  among  several 
which  serve  to  reduce  the  singularity  of  the  class  reference 
sequence  and  thus  degrade  the  performance  of  the  classification 
system.  The  submarine  example  in  the  data  set  was  described  as 
a "beam-to-bow  aspect  submarine."  It  would  seem  that  the  results 
of  experiment  1 could  have  been  improved  by  conditioning  the 
processor  with  several  groups  of  the  submarine  returns  arranged 
according  to  increments  of  target  angle  rather  than  a full  range 
of  aspects  in  one  class.  This  was  the  case. 
In experiment 3, six aspect groups were defined using
equal 18 degree increments beam to bow as depicted in figure 14
by training equal groups of 22 returns each, first to last.
The assumption, of course, is that the submarine's rate of change
of target angle was linear with respect to the ping rate. We
have no data describing the submarine's aspect during the run
other than that given above. The nonsubmarine returns were
also subdivided into three classes according to the type of
contact.
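
The grouping of the submarine returns can be sketched as a simple partition of the ping sequence into consecutive blocks. The numbering of returns from 1 to 132 and the function name are hypothetical; only the group count and group size come from the text.

```python
def aspect_groups(n_returns=132, n_groups=6):
    """Split returns numbered 1..n_returns into equal consecutive groups."""
    size = n_returns // n_groups
    return [list(range(g * size + 1, (g + 1) * size + 1)) for g in range(n_groups)]


groups = aspect_groups()
print([len(g) for g in groups])
print(groups[0][0], groups[-1][-1])
```

Assigning groups by position in the sequence is only as good as the assumption that target angle changed uniformly from ping to ping.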


All  but  two  of  the  submarine  returns  were  classified  as 
one  of  the  six  submarine  groups  for  a correct  classification 
rate  of  98.5%.  All  but  three  of  the  nonsubmarine  returns  were 
placed  in  one  of  the  NS  categories  for  a correct  nonsubmarine 
classification  of  95.9%.  Response  for  the  entire  data  set  was 
97.6%  correct. 

Individual aspect group responses were 90.9%, 90.9%, 95.4%, 86.4%,
86.4% and 90.9% correct progressing from bow to beam. Individual
nonsubmarine class responses were 91.3%, 96.0%, and 72.0% correct
for the biologics and fishnets; the single large biologic; and
the kelp patches, respectively. Of the submarine returns, 91.7%
were associated with the correct aspect class; of the nonsubmarine
returns, 88.4% were associated with the correct nonsubmarine
example.

The borderline cases (those inputs which evoke nearly the
same highest correlation with more than one class) always
included the correct aspect group or nonsubmarine example as one
of the choices. The other choice was always another aspect group
in the submarine case and another nonsubmarine example in two
of three nonsubmarine example borderline cases. Half (4) of
the other aspect group choices were an adjoining aspect group
to the correct class.

Five of the seven submarine return incorrect aspect group
classifications were still submarine classifications. Two of
these were classifications as an adjoining aspect group to the
correct group. Four of the seven nonsubmarine return
misclassifications were instead associated with another nonsubmarine
example. The second choice (second highest correlation value)
for the seven submarine return incorrect aspect group
classifications was in all cases another submarine aspect group: four
times the correct submarine aspect group and, of the other
three, once an adjoining aspect group. The second choice for the seven
nonsubmarine return incorrect example classifications was in five
cases a nonsubmarine example, four of which were the correct example.

In  the  cases  (117)  of  correct  submarine  return  aspect  group 
classifications,  the  next  highest  correlation  value  was  another 
submarine  aspect  group  113  times,  an  adjoining  submarine  aspect 
group  59  times  and  a nonsubmarine  example  only  four  times. 

Figures 15 through 19 are distribution plots of the
correlation values obtained when all submarine return inputs were
compared with the first submarine aspect group. Each figure
shows the distribution plot for a different aspect group.
Correspondingly, figure 20 is a composite plot of the "ROC" curves for
these distributions. The two errors in these cases are defined:
"error 1" occurs when the input is called a member of aspect
group $S_i$ ($i = 2, 3, \ldots, 6$) when in fact it is a member of aspect
group $S_1$; "error 2" occurs when the input is called a member
of aspect group $S_1$ when in fact it is a member of aspect group $S_i$.

Figure 17. Distribution of Aspect Group Returns with Reference Function

Figure 18. Distribution of Aspect Group Returns with Reference Function

Figure 20. Composite Plot of "ROC" Curves for Figures 15-19
(axis: number of errors; curves: $S_1$ vs $S_2$, $S_1$ vs $S_3$, $S_1$ vs $S_4$, $S_1$ vs $S_5$, $S_1$ vs $S_6$)

One final result from this experiment is plotted in figure
21. The upper curve is a plot of the percent change in calculated
echo length (transmission pulse considered) as the submarine's
aspect changes from beam to bow. The lower curve represents the
percent decrease in the average correlation value of the returns
from each of the aspect groups when compared with the
reference function. The percent decrease of each aspect group's
lowest correlation value is shown by the dotted curve. As
expected, the trend is downward in these latter curves, though
at not nearly the rate of the change in pulse length. The
dependence of performance upon pulse length is not evident from
this set of curves.
TWO-CLASS PROBABILITY OF CORRECT RESPONSE
AS A FUNCTION OF Γ

  Γ     P          Γ     P          Γ     P          Γ     P
 0.0  0.5000      1.0  0.8413      2.0  0.9772      3.0  0.9987
 0.1  0.5398      1.1  0.8643      2.1  0.9821      3.1  0.9990
 0.2  0.5793      1.2  0.8849      2.2  0.9861      3.2  0.9993
 0.3  0.6179      1.3  0.9032      2.3  0.9893      3.3  0.9995
 0.4  0.6554      1.4  0.9192      2.4  0.9918      3.4  0.9997
 0.5  0.6915      1.5  0.9332      2.5  0.9938      3.5  0.9998
 0.6  0.7257      1.6  0.9452      2.6  0.9953      3.6  0.9998
 0.7  0.7580      1.7  0.9554      2.7  0.9965      3.7  0.9999
 0.8  0.7881      1.8  0.9641      2.8  0.9974      3.8  0.9999
 0.9  0.8159      1.9  0.9713      2.9  0.9981      3.9  1.0000

